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r— I, Abstract 

A general method for deriving closed reduced models of Hamiltonian dynamical 
jJIh ■ systems is developed using techniques from optimization and statistical estimation. 

d . As in the standard projection operator methodology of statistical mechanics, a set 

of resolved variables is selected to capture the slow, macroscopic behavior of the 
system, and the family of quasi-equilibrium probability densities on phase space 
corresponding to these resolved variables is employed as a statistical model. The 
I macroscopic dynamics of the mean resolved variables is determined by optimizing 

^ ■ over paths of these probability densities. Specifically, a cost function is introduced 

\^ ■ that quantifies the lack-of-fit of a feasible path to the underlying microscopic dy- 

CN I namics; it is an ensemble-averaged, squared-norm of the residual that results from 

\ submitting a path of trial densities to the Liouville equation. The evolution of the 

macrostate is estimated by minimizing the time integral of the cost function over 
such paths. Thus, the defining principle for the reduced model takes the form of 
Hamilton's principle in mechanics, in which the Lagrangian is the cost function and 
the configuration variables are the parameters of the statistical model. The value 
function for this optimization, which plays the role of the action integral, satisfies 
I the associated Hamilton-Jacobi equation, and it determines the optimal relation be- 

tween the statistical parameters and the irreversible fiuxes of the resolved variables, 
thereby closing the reduced dynamics. The resulting equations for the macroscopic 
variables have the generic form of governing equations for nonequilibrium thermo- 
dynamics, and they furnish a rational extension of the classical equations of linear 
irreversible thermodynamics beyond the near-equilibrium regime. In particular, the 
value function is a thermodynamic potential that extends the classical dissipation 
function and supplies the nonlinear relation between thermodynamics forces and 
fluxes. 
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1 Introduction 



A main goal of nonequilibrium statistical mechanics is the derivation of effective equa- 
tions for macroscopic behavior from the equations of motion that govern an underlying 
microscopic dynamics [2, 16, 25, 26, 39, 40, 45]. Unlike equilibrium statistical mechan- 
ics, which rests on the firm foundation of conserved variables and Gibbsian ensembles, 
this endeavor is necessarily imperfect and approximate. First, it relies on the choice of 
relevant macroscopic variables, which is not universal. Even though in traditional set- 
tings such as hydrodynamics, the local equilibrium hypothesis can be invoked to justify 
the use of locally conserved quantities, in many situations it may be advantageous to 
consider reduced or extended sets of relevant variables [24]. Second, the selection of a 
macroscopic description depends on the existence of a separation of time scales between 
the relevant variables that evolve on a slow time scale and the remaining microscopic 
variables that are expected to equilibrate on fast time scales. But, in some problems for 
which a statistical mechanical description is sought, such as fluid turbulence, a wide sep- 
aration between fast and slow variables may not be present [32]. Third, when equations 
of motion for the relevant variables are derived by standard projection operator methods, 
they contain finite-memory effects and non-Markovian stochastic forcing terms, which are 
clumsy to handle until further approximations or limits are adopted to bring them to more 
tractable forms [11, 12]. As a consequence, most statistical treatments of nonequilibrium 
phenomena are restricted to idealized physical models having special microscopic dynam- 
ics. Often a key simplification is to replace a deterministic microscopic dynamics by a 
stochastic process at the outset of the analysis, constructing the process to be compatible 
with equilibrium fluctuations [19]. 

In the present paper we take a different approach, in that we seek to characterize 
the governing equations for a given set of relevant, or resolved, variables by applying 
a statistical model reduction procedure to the underlying deterministic dynamics itself, 
which we take to be a general Hamiltonian system having finitely-many degrees of free- 
dom. The central idea in our approach is a dynamical optimization principle over paths 
of macrostates, which may be outlined as follows, (i) Relative to a given set of resolved 
variables, a parametric statistical model of the system is imposed using quasi-equilibrium 
probability densities with parameters conjugate to the expectations of the resolved vari- 
ables, (ii) A macroscopic evolution is identified with a path in the parameter space of these 
trial probability densities, and the lack-of-fit of such a path to the underlying Hamiltonian 
dynamics is quantified by a certain cost functional, which is a time-integrated, weighted, 
squared norm of the residual of those densities with respect to the Liouville equation, 
(ill) The estimated, or predicted, evolution of the macrostate is determined by that path 
which minimizes the lack-of-fit cost functional. In this way we derive an irreversible, dis- 
sipative dynamics that is closed in the macrovariables and is optimally compatible with 
the underlying conservative, reversible microdynamics in a quantified, statistical sense. 

A particularly desirable feature of the closed reduced equations derived via our ap- 
proach is that they have the structure of the general equations for nonequilibrium ther- 
modynamics proposed by contemporary researchers [35, 21, 38, 37, 4, 3]. In particular, 
our model reduction strategy may be viewed as a dynamical variational principle that 



2 



justifies the so-called GENERIC (General Equations of NonEquilibrium Reversible Irre- 
versible Coupling) format for out-of-equilibrium dynamics [37]. In the notation used in 
the body of our paper, these equations have the form 

doi, _ y-^ dh_ _ dv_ , , 

dt ~ ^ ''da. d\^' ^ > 

in which a = (ai, . . . , a™) denotes the macrostate vector, which is the expectation of the 
given vector of resolved variables A = {Ai, . . . ,A„i) defined on the phase space of the 
Hamiltonian dynamics. The vector A = (A^, . . . , A"') consists of parameters conjugate to 
a, given by A* = —ds/dai, where s — s{a) is the entropy of the macrostate a. The first 
term in (1) is a generalized Hamiltonian vector field having an energy function h — h{s, a) 
and a Poisson matrix Q = fl(a), which is anti-symmetric; the partial derivatives dh/dai 
are at constant entropy s = s{a). This term governs the reversible part of the reduced 
dynamics. The second term in (1) is a gradient vector field with respect to the parameter 
vector A having the potential function v — v{\). This term governs the irreversible part 
of the reduced dynamics. In our variational formulation the thermodynamic potential 
v{X) is the value function for the optimization principle; that is, v{X) equals the optimal 
value of the lack-of-fit cost functional in the minimization over paths emanating from the 
macrostate with parameter vector A and tending to equilibrium in infinite time. The value 
function therefore solves the time-independent Hamilton-Jacobi equation associated with 
optimization principle. The functions h{s,a) and Q(a) in the reversible term are entirely 
dictated by the form of the trial probability densities defining the statistical model. By 
contrast, the value function ■;;(A) depends on the weights introduced into the lack-of- 
fit cost function, and these weights determine the adjustable parameters of the ensuing 
closure theory. 

Under the near-equilibrium approximation, the closed reduced equations (1) take the 
classical form of linear irreversible thermodynamics [15]. Namely, a linear system of 
differential equations relates the fluxes dui/dt to the affinities (or thermodynamic forces) 
—A-' ; namely, 

^ = E [ - M^, ]X\ a, = E QjX^ ■ (2) 

j j 

The coefficients in these equations are evaluated at equilibrium, a = 0; /3 is the inverse 
temperature, and the matrix (C^) is the inverse of the matrix —{d^s/daidaj). In this 
formula M is the symmetric and positive-definite matrix that defines the near-equilibrium 
value function, which is the quadratic form 

The physical meaning of the value function becomes transparent from these identities, 
which imply that ds/dt = 2v{X). That is, the optimal value of the cost functional coincides 
with half the entropy production along the best-fit path. 
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Beyond the regime in which these near equihbrium approximations hold, the closed 
reduced equations (1) remain valid as the nonlinear equations that determine the best 
fit of the statistical model to the underlying dynamics. The utihty of these equations 
depends on the choice of the resolved variables and the separation of time scales between 
the resolved and unresolved variables in the statistical model, not on the closeness of the 
modeled macrostates to equilibrium. In the far-from-equilibrium regime the value function 
v{X) furnishes a natural extension of the classical dissipation function that is justified by 
a statistical mechanical derivation. Accordingly, our optimization-based derivation of the 
GENERIC theory offers both a new interpretation of the potential for the irreversible part 
of the governing equations (1), and a systematic way to calculate nonlinear corrections to 
the classical linear equations (2). 

The paper is organized as follows. Section 2 presents the background on the general 
closure problem and introduces the family of trial probability densities that constitute 
the statistical model. In Section 3 the stationary version of the defining optimization 
principle is formulated and the general nonequilibrium equations for the best-fit reduced 
model are derived. Properties of these equations are discussed in Section 4, as well as the 
near-equilibrium linearization of the reduced model. In Section 5 the theory is extended 
to include an initial period of development during which the entropy production increases 
from zero. This plateau effect is encountered when the specified initial statistical state 
is itself a quasi-equilibrium ensemble. The value function then becomes time- dependent, 
V = v{X,t), and solves a time-dependent Hamilton- Jacobi equation. Section 6 discusses 
the conceptual framework of the best-fit approach to closure and points to some specific 
test cases and applications. 

2 Statistical model and trial densities 

The underlying microscopic dynamics is assumed to be a Hamiltonian system. In canon- 
ical form the equations of motion are 

J = JVM^) with ^ = ( o ) ' 

where z = {q,p) denotes a generic point in the phase space r„ = M^"', n being the number 
of degrees of freedom of the system [1, 30]. Our theoretical development permits general 
systems of this kind, with time-independent, smooth Hamiltonians H on r„. In fact, 
our development only makes use of the general structure of the underlying microscopic 
dynamics — essentially, the conservation of energy and phase volume. Consequently it 
also applies without fundamental modifications to dynamical systems in a noncanonical 
Hamiltonian form. It is many such noncanonical systems, however, there are conserved 
quantities other than H which play an important role in conditioning the statistical be- 
havior of the dynamics. A version of the theory appropriate to infinite-dimensional dy- 
namics, such as occur in continuum models, could also be formulated. But it would be 
necessary to carry out a limiting analysis of discretized models in order to justify the 
statistical description. For the sake of definiteness, therefore, we restrict our discussion 
to finite-dimensional, canonical systems (3). 
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We are interested in making predictive estimates of some relevant macroscopic dy- 
namical variables rather than following the details of the microscopic dynamics (3) itself. 
We therefore seek a statistical closure in terms of some resolved dynamical variables. In 
keeping with the general character of our approach, we allow any finite number of ar- 
bitrary resolved variables A^, k = 1, . . . ,m, assuming that each is a smooth real- valued 
function on r„ and the set Ai, . . . , A^ is linearly independent. We assemble them into 
the resolved vector A — {Ai, . . . , Am)- Typically, m <^n. 

The evolution of any dynamical variable, F, resolved or not, is determined by the 
equation 

where {F, if } = (VF)*JVii is the Poisson bracket associated with the canonical Hamil- 
tonian structure. Indeed, the statement that (4) holds for all smooth functions F on 
r„ is equivalent to the Hamiltonian dynamics (3). Fundamentally, the problem of clo- 
sure in terms of the resolved variables Ai, . . . , Am arises from the fact that, except under 
very special circumstances, the derived variables {Ai, H}. . . . , {Am, H} are not express- 
ible as functions of the resolved variables Ai, . . . , A^- This generic fact makes a statistical 
description of the unresolved variables necessary. The foundation for such a statistical 
description is provided by the Liouville equation, which governs the propagation of prob- 
ability under the phase flow. Namely, for probability measures on r„, p{dz, t) = p{z, t)dz, 
having smooth densities p with respect to the invariant, 2n-dimensional, phase volume 
element dz = dqdp, 

^ + Lp = inTnXR, (5) 

in which L = {■, H} denotes the Liouville operator. Given a density p{z,to) at an initial 
time (5) completely determines the density p{z, t) at any later time t, which is denoted 
formally by t) = e~*^*~*o)^ to). We denote the expectation of any dynamical variable 
F with respect to p{t) by 



{F\p{t)) = / F{z)p{z,t)dz. 



[This bracket notation is used for expectation throughout the paper.] The Liouville equa- 
tion is equivalent to the statement that 

f^{F\p{t)) = {LF\p{t)) (6) 

for every dynamical variable F. In particular, the evolution of the mean of the resolved 
vector, a(t) = {A\p{t)), is determined by the exact solution of (5). But the exact density 
p{z, t) evolving under the Liouville equation is highly intricate; it contains vastly more 
information than the resolved m- vector a{t); and, it is excessively expensive to compute 
numerically, since it requires the integration of the microscopic trajectories starting from 
each sample point in an initial ensemble that approximates the given initial density. 

For this reason we seek a statistical closure in terms of an analytically tractable family 
of trial probability densities having the same level of complexity as the resolved variables 
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themselves. In the nomenclature of statistical inference, we impose a parametric statistical 
model for which the resolved vector A is a minimal sufficient statistic [9, 28]. We denote 
this family of probability densities by p{z; A), using a parameter vector A = (A^, . . . , A"*) e 
M"^. The family is assumed to be regular, meaning that the densities depend smoothly 
on A, the score variables are defined by 

UW = (7) 

and the Fisher information matrix defined by 

C{X) = {U{X)U{Xr\p{X)) (8) 

is nonsingular. [Here and throughout the paper we use the notation M* to denote the 
transpose of any matrix M; in particular, a column vector V is taken to its dual row vector 
V*. Also, we use the shorthand notation d/dX — {d/dX^, . . . , d/dX'^).] The assumption 
that A is a minimal sufficient statistic for the family p(A) implies that there is a one-to- 
one correspondence between the expected resolved vector a = {A\p{X)) and the parameter 
vector A. Accordingly, each macrostate can be identified with a imique parameter vector 
A, which ranges over the configuration space i?™, or perhaps a subset of it. We assume 
that the parameterization in terms of A is arranged so that the origin, A = 0, is an 
equilibrium density, meaning that Lp{0) — 0. 

Our interest centers on modeling the relaxation of the macrostate, a{t) = {A\p{X{t))) , 
from a specified nonequilibrium macrostate at time t = towards equilibrium. Such 
a relaxation corresponds to a path X{t) in parameter space for which A(0) = Aq 7^ and 
X{t) -^0 ast ^ +00. 

Even though we formulate the best-fit closure concept in such a general setting, we con- 
centrate our attention in this paper on the model that uses the family of quasi-equilibrium, 
or quasi-canonical, densities [22, 23, 31, 43], 

p{z; (3, A) = exp[-^// + X*A - ^{(3, A)] , (9) 

in which 

A) = log / exp{-l3H + X*A)dz. (10) 

[Here and throughout, X*A = X]i AMj.] These densities depend on the inverse temperature 
(3 as well as the parameters A^, . . . , A™ associated with the nonconserved resolved variables. 
Some growth conditions on H and A at infinity may be required to ensure that the 
normalization (10) exists and is finite; for simplicity in our exposition, we assume that 
these densities exist for all /3 > and A e M"^. The associated entropy function, s{u, a), 
of the mean energy, u, and the mean resolved vector a, is determined by the maximum 
entropy principle, 

s{u,a) — max — (logpjp) subject to {H\p) — u, {A\p) — a, (l|p) =1. (11) 

The maximizer in (11) is the quasi-canonical density (9), and the parameter /3 and pa- 
rameter vector —A are Lagrange multipliers for the constraints on mean energy and mean 
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resolved vector, respectively. The entropy function s{u, a) is the negative of the convex 
conjugate function to A) defined in (10); that is, —s{u, a) is the Legendre transform 
of A), and hence all the following relations hold: 

dijj dij) ds ds 

-s{u,a)^-Pu + X*a-m>^), 'q^^^^ = " ' 9^ = ^' "a^^^" 

There are two ways to handle the dependence of the family (9) on /3, or equivalently, 
on the mean energy u = {H\p), and this choice leads to two versions of the best-fit closure 
theory. 

In one version we fix ^ > and set p(A) = p{/3,X). The equihbrium density p(0) is 
the canonical Gibbs ensemble on r„, 

Pe,{z) = Z{^)-^exp{-pH{z)), with Z{I5)= j exp{-/3H{z)) dz . (12) 

We let {F)eq denote expectation of any dynamical variable F with respect to peq. The 
canonical statistical model with fixed inverse temperature therefore consists of the densi- 
ties 

p{z;X) = exp{\*A-(l){\))p,g{z), with 0(A) = log ( exp(AM) . (13) 

Under this choice of trial densities the mean energy {H\p{X)) is allowed to vary along a 
path A = X{t). This version of the model is appropriate to a physical system in which an 
energy reservoir maintains a constant system temperature. 

A standard motivation for using the family of densities (13) is that each member of 
the family minimizes relative entropy subject to the mean value of the resolved vector; 
that is, p solves 

—s{a) = min(log — |p) subject to {A\p) = a, (l|p) = 1. 



p 



Peq 



Prom the perspective of information theory [13, 28], p{z; A) is the least informative proba- 
bility density relative to the equilibrium density Peq that is compatible with the macrostate 
vector a = {A\p). The relative entropy, —s{a), as a function of the mean resolved vector, 
a, is the Legendre transform of 0(A), and the one-to-one correspondence between A and 
a is a convex duality given by 

In statistical inference, (13) is called an exponential family relative to peq with natural 
parameter A [9]. The score vector, ^7(A) = A — a, is the resolved vector centered around 
its mean, a = {A\p{X)). The Fisher information is C(A) = {{A — a){A — a)* \ p(A) ), the 
covariance matrix for the resolved variables. 

In the alternative version we impose the conservation of mean energy E = {H\p{X{t))) 
exactly along paths X{t). This constraint is achieved by allowing /3 = /3(A) to vary with 
A, and setting p(A) = p(/3(A), A). From a physical perspective this version is appropriate 
whenever the model system is isolated. Moreover, it leads to a closed reduced dynamics 
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having precisely the form of GENERIC thermodynamics [37] . For these reasons, we also 
develop this version of the best-fit closure theory in the present paper, even though the 
version with fixed temperature is technically simpler. 

The function ^(A) is determined by the requirement that, for all admissible parameter 
vectors A, 

{H\p{^,X))^-^{^,X)^E, (15) 

where E is the equilibrium energy corresponding to A = 0. The unique solvability of this 
equation for /3 = /3(A) is ensured by the strict convexity of ip. Differentiating (15) with 
respect to A yields 

0^^{H\m = {(H-E)U(Xrp(X)) 
where the score vector in this energy-conserving model is 
U(\) - (A n) iiH-E){A-a)\p) 

U{X) -{A -a) {(H-Em - ■ ^^^^ 

Thus, the score variables f/j(A) arc the centered resolved variables Ai projected onto 
the subspace of L^(r„,p) orthogonal to the centered Hamiltonian H — E. The Fisher 
information matrix (8) is the covariance matrix for these projected resolved variables: 

r(^^ KA n\( a nV\ {(H - E){A - a)){{H - E){A - ay) 

Before proceeding to formulate our optimization principle over paths of macrostates, 
let us first review a naive closure procedure for the quasi-equilibrium probability densities 
p(A) that produces the reversible part of the closed reduced dynamics, but entirely sup- 
presses the irreversible part [20]. Motivated by the moment equations (6), which hold for 
any dynamical variable F and for the exact Liouville solution p{-,t), one may impose the 
A- moments of the Liouville equation on the trial densities p{X{t)). Then the parameter 
path X{t) G M"^ is required to satisfy the m differential equations 

This simple moment closure with quasi-equilibrium densities is memoryless, and conse- 
quently it is entropy conserving. The entropy production calculation is: 

5 = -A*^ = -{L{X*A)\piX)) 



dt dt 



I {X*A,H} e^^[-^H + X*A-^lj{^,X)]dz 



= - / {exp[-pH + X*A-iP{p,X)],H}dz ^ 0. 
The last equality makes use of the general integration-by-parts identity, 

/ {F,,F2}Fsdz ^ - [ {F3,F2}F,dz, 
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which is also used in several calculations to follow. 

The closed reduced equations governing this adiabatic closure can be written as 



da . , , (9s 

^ = /(A), w,th X = (18) 

introducing the vector field 

/(A) = {LA\~p{X)) (19) 
= / {A,H} ex^[-pH + \*A-il:{P,\)]dz 

= -p-^ ! {A, exp[-/3//]} exp[A*A- V(;5,A)]d^ 

= p-^ I {A cxp[A*A-V(/9,A)]} exp[-/3i7](i^ 

= r'(KA}|p(A))A. 

In either version of the model, with fixed ^ or fixed E, it is possible to express /(A) as a 
generalized Hamiltonian vector field. In both cases the energy representation, u — h{s, a), 
which inverts the entropy representation, s = s{u,a), plays a central role. 

For the model with fixed /3, we introduce the associated free energy function 

h{l3,a) = u-/3-h. (20) 

Technically, h is the negative of the convex function conjugate to h{s, a) with respect 
to s, considered as a function of the temperature 9 — /3~^ — dh/ds. A straightforward 
calculation using the properties of the Legendre transform yields 

oa oa 

in which the a-gradient of the free energy, h, is a constant temperature. Substituting this 
expression into (19), we obtain the adiabatic closed reduced equation (18) in the form, 

da ^, .dh 

— = n(a) — . 21 
dt ^ ' da ^ ' 

Here we introduce the Poisson matrix, 

n = ({AA*}|p(A)), (22) 

which we consider as a function of a, recalling that there is a smooth, invertible mapping 
between A and a. The mxm matrix fl is antisymmetric, state-dependent, and not neces- 
sarily invertible. Consequently, the reduced dynamics (21) is a generalized Hamiltonian 
system, or Poisson system, with a possibly degenerate symplectic structure. Nonetheless, 
it determines a well-defined reversible dynamics for any regular set of resolved variables 
[4, 37]. 
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For the energy-conserving model, entropy at fixed mean energy E depends on a alone: 
s — s{E, a). DifTerentiating the identity E = h{s{E, a), a) with respect to a yields 

^ dh ds ^ dh ^. ^ dh 

ds da da da 

As in (21), we thereby obtain the adiabatic closed reduced dynamics in the form 

da ^, . dh , , 

in which the a-gradient of the mean energy, h, is at constant entropy. 

3 Stationary formulation 

The adiabatic closure discussed in the previous section imposes the A-moment of the 
Liouville equation on the quasi-equilibrium probability densities p(A), for X = X{t) G -K™, 
and thereby determines m ordinary differential equations of first order for the evolution 
of the macrostate a{t) = ( ^ | p(A(t) ). This dissipationless closure is clearly deficient as a 
model of the statistical behavior of the resolved vector A e M^, because it is suppresses 
the infiuence of the unresolved variables on the resolved variables. The traditional remedy 
for this deficiency is to obtain closure by taking the A-moment with respect to a family of 
probability densities on phase space having memory [31, 43, 44]. The customary choice is 
given by the so-called theory of nonequilibrium statistical operators, in which the densities 
have the form _ 

POO 

Pe{;t)^ e--^p(-;A(t-T))ee— dr. (24) 
Jo 

Such a density Pe{-,t) is a functional of the path \{t) corresponding to a macroscopic 
evolution. It depends on the values of the resolved vector A at times prior to t via the 
propagator e~'^^ for the Liouville equation in elapsed time r > 0, and includes an artificial 
decay rate e > for the memory. The nonequilibrium statistical operator (24) at time t is 
thus a time- weighted superposition of exactly propagated solutions from quasi-equilibrium 
densities at earlier times t — r. The density (24) satisfies the inhomogeneous Liouville 
equation 

^ T r ~i 

— + Lp, ^ -€[pe- p\ . 

This equation shows that the nonequilibrium statistical operator p^ becomes a formal 
solution of the Liouville equation in the limit as e — )■ 0-I-. The usual procedure to obtain a 
closure in terms of the mean resolved vector a{t) is to impose the following two equations: 

^ = {LA\p,), and a{t) ^ {A\ p,{;t)) ^ {A\ p{. ; X{t))) , (25) 

for sufficiently small e > (or in an appropriate limit as e — )■ 0+). The first equation, 
which is precisely the A-moment of the Liouville equation itself, defines the closed reduced 
equations for the macrostate a; the second equation is a consistency condition between 
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p to pe that determines A in terms of a common mean resolved vector a. Since p^ relies 
on the memory of A under the phase flow, the resulting closure is governed by integro- 
differential equations in a, in which the propagator for the Liouville equation appears 
explicitly. 

The main justification for the closure theory based on nonequilibrium statistical op- 
erators is that an irreversible thermodynamic formalism is achieved. In this theory the 
irreversiblity of the closed reduced dynamics for a is produced by the slowly fading mem- 
ory of the density p^ used to form the instantaneous 74-moment equations. That is, the 
use of rather than p breaks the time-reversal symmetry of the adiabatic closure. The 
reduction obtained in this way, however, is nearly an identity, as opposed to a systematic 
method of approximating the influence of the unresolved variables on the resolved vari- 
ables. Implementation of the reduced equations requires that ensembles of trajectories 
of the underlying Hamiltonian system be computed over a time interval over which the 
memory fades. In situations in which there is a wide separation of time scales between the 
resolved and unresolved variables, a further analysis is required to deduce the autonomous 
ordinary differential equations that govern the macrostate. 

In these respects, the theory of nonequilibrium statistical operators shares the main 
features of the well-known projection methods of nonequilibrium statistical mechanics 
[2, 45, 41, 36]. In that methodology, a stochastic integro-differential equation is produced 
by projecting the Liouville propagators onto the resolved variables. The result is an exact 
identity that includes a time-convolution with a memory kernel, and random forcing 
terms that are related to the kernel via the fluctuation-dissipation theorem. At least in 
the near-equilibrium regime, the outcome of either theory is essentially the same, as are 
the challenges faced when implementing the reduced equations in a speciflc problem. 

For these reasons we pursue a fundamentally different approach in the present work. 
In our approach the quasi-equilibrium densities, p(A), are retained to establish the ther- 
modynamic structure of the reduced dynamics, but moment closure is replaced by an 
optimization procedure over paths of trial densities, p{\{t)). The cost functional in this 
optimization is designed so that the optimal path is that path which is most compati- 
ble with the underlying Hamiltonian phase flow in a statistical, or information-theoretic, 
sense. No other densities need be introduced to yield an irreversible closure, nor is there 
a need for a stochastic representation intermediate between the deterministic microscopic 
dynamics and the macroscopic reduced equations. 

The lack-of-flt cost function that is the basis of this procedure is constructed as follows. 
We introduce the residual of log p with respect to the Liouville operator, namely, 

i?(-;A,A) = (^^ + Ljiogp(.;A(t)). (26) 

[Here and throughout, A = dX/dt.] There arc two related expressions for this statistic R, 
which we call the Liouville residual, that reveal its signiflcance to the closure problem. 

First, for any z & Fn and any smooth parameter path X{t), the log- likelihood ratio 
between the density propagated for a short interval of time At under the Liouville equation 
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and the evolving trial density is 



^^^T^^^^T^ = -(^t)R{z;X{t)Mt)) + 0((Atf ) as At ^ . 

This expansion shows that, to leading order locally in time, —R{z; A(t), A(t)) represents 
the information in the sample point z for discriminating the exact density against the 
trial density [13, 28]. By considering arbitrary smooth paths passing through a point A 
with a tangent vector A, we may consider R{z; A, A) evaluated at 2; as a function of (A, A) 
in the tangent space to the configuration space, and we may interpret it to be the local 
rate of information loss in the sample point z for the pair (A, A). 

Second, for any dynamical variable F : r„ — > 1?, the F-moment of the Liouville 
equation with respect to the trial densities along a path X{t) is 

j^{F\~p{Xm-{LF\~p{Xm = {FR\~p{Xm- 

Thus, while an exact solution p{t) of the Liouville equation satisfies (6) for all F, a trial 
solution p(A(t)) produces a departure that coincides with the covariance between F and 
R. This representation of R furnishes a natural linear structure for analyzing the fit of 
the trial densities to the Liouville equation. 

In light of these two interpretations of the Liouville residual R, we quantify the dynam- 
ical lack-of-fit of the statistical model in terms it. To do so, we consider the components 
of R in the resolved and unresolved subspaces. At any configuration point A e IR"^, let 
Pa denote the orthogonal projection of the Hilbert space L^(r„,p(A)) onto the span of 
the score functions f/i(A), . . . , Um{X), and let Q\ — I — Fx denote the complementary 
projection; specifically, for any F e L^(r„,p(A)), 

PxF = {FU{Xy I p(A)) C{X)-'U{X) . 

where C(A) is given in (8). We declare the lack-of-fit cost function to be 

>C(A, A) = i( {P,Rf I p(A) ) + 1( {WxQxRy \ p(A) ) . (27) 

£ is a weighted, mean-squared norm, with respect to a linear operator Wx on L^(r„,p) 

that assigns relative weights to the resolved and unresolved components of the Liouville 
residual. Wx is assumed to be self-adjoint, and to satisfy WxPx = PxW = Px- It follows 
that the resolved and unresolved subspaces are invariant under Wx. and that unit weights 
are assigned to the resolved subspace. The weights assigned by Wx to the unresolved 
subspace constitute the adjustable parameters in the closure theory. 

A more explicit formula for C is derived as follows. The Liouville residual has zero 
mean, {R\p{X)) — 0, and its orthogonal components are given by 

PxR^iX- C{Xy\ LU{X) I p(A) ) YU{X) , QxR = QAi^logp(A) . 
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The calculation of PxR employs the string of equations, 

([Llogp(A)]C/(A)|p) = / {~p{\),H}U{\)dz 

= - / {U{\),H}p{\)dz 
= -{LU{X)\p). 

In light of these formulas the lack-of-fit cost function can be written in the form, 

£(A,A) = 1[A-C(A)-V(A)]*C(A)[A-C(A)-V(A)] + w(A) , (28) 

where, compatibly with (19), we employ the vector field 

/(A) = {LU{X)\p{X)), (29) 
and we define the non-negative function 

^(A) = \{mQxL\ogp{X)]' I p(A)) . (30) 

We now formulate the stationary version of the optimization principle that defines 
our statistical closure. For an initial time, we consider the dynamical minimization 
problem 

f+ca 

v{Xo) = min / £(A, A) dt , for Aq G iR"* , (31) 

A(to)=Ao Jto 

in which the lack-of-fit cost function (28) defines the cost functional over admissible paths 
A(i), to <t < +00 , in the configuration space of the statistical model, with the constraint 
that each path starts at Aq at time tg- In conformity with optimization and control theory, 
we refer to v{Xq) as the value function for the minimization problem (31) [8, 14, 18]. Since 
the cost function C{X, A) is independent of t, and the optimization extends to infinity in 
time, v{Xq) is independent of to; indeed, the time variable in (31) can be shifted so that 
to is replaced by 0. It is in this sense that we refer to (31) as the stationary formulation 
of the best-fit principle. 

By analogy to analytical mechanics, one may regard (31) as a principle of least action 
for the "Lagrangian" £{X, A) and interpret the first member in (28) as its "kinetic" term 
and the second member as its "potential" term [1, 30]. The kinetic term is a quadratic 
form in the generalized velocities A with positive-definite Fisher information matrix C(A), 
and it is entirely determined by the expressions that also arise in the adiabatic closure 
presented in the previous section. The potential term w{X) embodies the influence of the 
unresolved variables on the resolved variables, and it depends on the weight operator Wx 
that quantifies that infiuence. Accordingly, we call w{X) the closure potential. Of course, 
these mechanical analogies are not literal: C has the units of a rate of entropy production, 
not of an energy, and it is a sum of its "kinetic" and "potential" terms, not a difference. 

An extremal path X{t), with A (to) = Aq, for (31) determines the best-fit evolution 
of the statistical macrostate p(-;A(t)), in the sense that the time-integral of the lack- 
of-fit cost function C is minimized along the path X{t). Thus, the extremal paths for 
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(31) define the estimated, or predicted, evolution in our reduced model, and the closed 
equations governing this evolution are derived from the optimization principle (31). 

The derivation of the closed reduced equations that follow from the optimality con- 
ditions for (31) makes use of Hamilton- Jacobi theory [1, 30, 18, 17]. The value function, 
v{\), on A G iR™", defined by (31), is analogous to an action integral, or Hamilton princi- 
pal function, and it therefore satisfies the associated time-independent Hamilton- Jacobi 
equation, 

H(A.-g=0. (32) 

where H{X, /i) is the Legendre transform of >C(A, A). That is, C and H are convex conjugate 
functions with respect to their second arguments, and 

= ^ = C(A)A - /(A) (33) 
oX 

n{X, n) = A> - £(A, A) = J/i*C(A)- V + /(A)*C(A)- V - w{X) . 

These calculations are straightforward and explicit because £(A, A) is a quadratic function 
of A. According to Hamilton- Jacobi theory, the conjugate variable // = /i(i) along an 
extremal path A = A(t) is given by the relation 

A = -^(A). (34) 

This basic relation together with (33) closes the reduced dynamics along the extremal 
path, in that it yields the (vector) differential equation 

In summary, the choice of a statistical model of trial probability densities p(A) and 
the specification of a weight operator W\ determine the lack-of-fit Lagrangian C and its 
associated Hamiltonian "H, and thereby the value function v{X) for the best-fit optimiza- 
tion principle. All the terms in (35) are then uniquely defined, and this equation governs 
the evolution of best-fit statistical states in the reduced model. The modeled relaxation is 
described by an extremal path X{t) satisfying lim(_^+oo X{t) — , since the trial densities 
p(A) are parameterized so that p(0) is an equilibrium density. 

The value function satisfies the equilibrium conditions 

^(0) = 0, 1^(0) = 0; (36) 

the first is immediate from (31), and the second is a direct consequence of (32). Moreover, 
the Hessian matrix of second partial derivatives of the value function is non-negative, 
semi-definite at equilibrium. 
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In the non-degenerate case when the resolved variables do not contain any conserved 
quantities, this matrix is actually positive-definite. These properties refiect that fact that 
the equilibrium state A = is a local minimum of f (A), and in the non-degenerate case 
a strict local minimum. The value function defined by (31) is the so-called viscosity 
solution of (32) validating these equilibrium conditions. The reader is referred to [17] for 
the modern theory of existence and uniqueness of solutions to Hamilton- Jacobi equations, 
including the time-dependent equations encountered in Section 5. 

The foregoing discussion applies to the most general form of the reduction using any 
regular family of densities /5(A) as the statistical model. We now specialize the results 
contained in (35) to the models that use quasi-equilibrium densities (9), for either fixed 
inverse temperature /3 or fixed mean energy E. 

For the family (13) with fixed /3 > 0, the lack-of-fit Lagrangian takes the form (28) 
with C(A) = {{A - a){A- a)* \ p(A)) and /(A) = {LA \ p(A)). The closure potential can 
be expressed in the explicit form 

w{X) = ^X*{[W,QxLA] mQxLA*] \ p{X) ) A = \x*D{X)X , (37) 

where the second equality defines the mxm Gram matrix D{X). Thus, w{X) is expressed 
as a non- negative quadratic form in A, with a symmetric, semi-definite matrix D{X) which, 
in general, is A-dependent. It is important to note that the weight operator Wx enters 
into the best-fit closure scheme only through the matrix D{X), and so all the adjustable 
parameters introduced into the scheme as weight factors in the lack-of-fit cost function 
are assembled in D{X) alone. The practical utility of this approach to closure rests on the 
requirement that in problems of interest there be a natural choice of weights via Wx that 
leads to a tractable number of adjustable parameters in D{X) and hence in w{X). 

The closed reduced equations governing the evolution of the best-fit macrostate are 

ft ^ ^^^^ " Ix^^^ ' ^^^^ ^ ' ^^^^^ ■ ^^^^ 

Referring to (18) we see that (38) differs from the adiabatic reduced equation derived in 
the previous section by the presence of the gradient of v{X). In the subsequent section, 
this gradient term is shown to be the source of entropy production in the closed reduced 
dynamics, and consequently the best-fit evolution governed by (38) is irreversible. It is 
illuminating to display the Hamilton- Jacobi equation satisfied by v{X), namely. 

The source term in this partial differential equation for v{X) is the closure potential with 
matrix D{X). This matrix quantifies the magnitude of the unresolved Liouville residual 
and, in turn, generates the irreversible part of the closed reduced equation for a{t) via 
(39). In this way the irreversible reduced macrodynamics ensues from the reversible, 
conservative microdynamics. 

The same governing equations hold for the version of the reduction in which the mean 
energy {H \ p) = E is fixed. Specifically, the closed reduced equation governing this version 
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is identical to (38), the only change being that C(A) is replaced by the appropriate matrix 
(17); the terms /(A) and w(A) take the same forms as in the version with fixed ^. 

The reversible term in either of these reduced equations is expressible as a Hamiltonian 
vector field in the same way as in Section 2. That is, the effective Hamiltonian for the 
version with fixed /3 is the free-energy function, h, while for fixed E it is the mean energy 
function, h. 

4 Properties of the reduced model 

The best-fit statistical closures developed in the preceding section result in reduced equa- 
tions for the macrostate a which possess the so-called GENERIC (General Equations 
of NonEquilibrium Reversible-Irreversible Coupling) format, a general framework for the 
equations of thermodynamics and hydrodynamics [21, 38, 37]. Let us consider the for- 
mulation with a fixed mean energy \ p{\)) = E, for which the reversible dynamics is 
given by (23). The governing equation (35) for the best- fit closure has both reversible 
and irreversible terms, which are expressible in the form 

|..,a)|(a)-|(A), A = (4o) 

As in Section 2, s = s{E, a) denotes the entropy relation s = s{u, a) in (11) evaluated at 
fixed energy, E, and the mean energy function, u — h{s,a), is defined by inverting this 
relation. The reversible term in (40) is a Hamiltonian vector field with Hamiltonian h 
and cosympletic matrix, Q; the gradient dh/da is at fixed entropy. The irreversible term 
in (40) is a gradient vector field of the value function v with respect to A = —ds/da. This 
form of the irreversible term conforms precisely with the general, nonlinear GENERIC 
form proposed by Grmela and Ottinger [21, 38]. Other closely related formats have also 
been developed [35, 4, 3]. 

By virtue of (34), the irreversible term in (40) is exactly the conjugate vector jl. This 
correspondence furnishes the thermodynamic interpretation of the vectors A and /i in our 
optimization-based theory. Namely, the variables A^, . . . , A™, which are the "coordinates" 
of the Hamilton- Jacobi theory, may be interpretated as (minus) thermodynamic forces, 
or affinities; and, the variables /ii, . . . , fim, which are the conjugate "momenta", may be 
interpreted as corresponding thermodynamic fluxes. [Our notation introduces a minus 
sign in A compared to much of the physical literature.] Thus, the duality naturally 
attached to our variational principle (31) coincides with the duality between "forces" and 
"fluxes" in nonequilibrium thermodynamics [15, 26]. 

The value function, v{X), which mediates this duality, is closely related to the entropy 
production. Namely, an immediate implication of (40) is the fundamental identity. 

An entropy production inequality follows from this identity, provided that v{X) is convex. 
This convexity is guaranteed whenever the cost function >C(A, A) is jointly convex in the 
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pair (A, A); but, for large |A| the dependence of the coefficient functions C(A) and -D(A), 
as well as the nonhnearity of /(A), may result in a loss of convexity. Nonetheless, the 
convexity of v{X) always holds in some neighborhood of equilibrium, A = 0. Thus, apart 
from a far-from-equilibrium regime in which convexity may be lost, we are assured that 
entropy increases at a rate bounded below by v: 

|>„(A)>0. (42) 

This inequality follows from the fact that = v{0) > f(A) — \*dv/d\{\) for A in the 
domain of convexity of v. The value function is thus found to be the key thermodynamic 
potential in the nonequilibrium reduced model, and its close relation to entropy production 
further reinforces the information-theoretic basis of the optimization principle (31). 

Since nonequilibrium behavior is most readily understood in the near-equilibrium 
regime, we now present the best-fit closure theory in its approximate form for small 
|A|; that is, for paths X{t) in a neighborhood of A = having trial densities p(A(t)) close 
to peq. We assume that the resolved vector A is normalized so that {A)eq = 0. We also 
assume that {AH)eq = 0; this is easily obtained by replacing each Ai by A — ai{H — E) 
for appropriate a^. Under these normalizations the linearization of the closure is identical 
for the versions that fix either ^ or E along paths. 

The near-equilibrium form of the cost function (28) is simply 

£(A,A) = hx-C-^JX]*C[X-C-^JX] + \x*DX, (43) 

Zi Zi 



C = C(0) = (AA*)e,, D = D{^) = {{WQLA){\VQLA')),,, J = Mq) = {{LA)A*),,. 



in which the coefficient matrices are constant, being evaluated at equilibrium; specifically, 

df 

C is a positive-definite symmetric matrix, D is a semi-definite symmetric matrix, and 
J is an anti-symmetric matrix. The cost function (43), which is a quadratic form in 
(A, A), governs the linear relaxation from near-equilibrium initial states Aq, a phenomenon 
treated by familiar hnear-response theory [10, 45] . The value function associated with C 
is therefore also a quadratic form, and in light of (36) it is simply 

v{X) = -X*MX, (44) 

for some constant, symmetric m x m matrix M . The Legendre transform yields the 
following expressions for the conjugate vector of fluxes and the Hamiltonian: 

fi^CX-JX, H{X, fx) = \fjrC-^fi - X*JC-^fi - \x*DX . 

Zi z 

Along an extremal path, A(t), the relation between flux and force is linear, jl = —MX. 
Consequently, the closed reduced equation for near-equilibrium relaxation is the constant- 
coefficient linear system of ODEs, 

^ = (J-M)A, a = CA. (45) 
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The matrix, M, is determined by an algebraic Riccati equation, to which the stationary 
Hamilton- Jacobi equation (32) reduces [29]; namely, 

MC-^M + JC-^M - MC-^J = D . (46) 

The coefficient matrices, C and J, arc determined entirely by the parametric statistical 
model, whereas the source matrix, D, depends on the choice of the adjustable constants 
in the closure. M is the unique, semi-definite matrix solution of (46). The matrix of 
transport coefficients in (45) thus includes an antisymmetric (reversible) part and a sym- 
metric (irreversible) part, and while the irreversible part, —M responds directly to the 
choice of closure matrix D, it is also modified by the reversible part, J, via the coupling 
terms in the Riccati equation. 

Under the near-equilibrium approximation the entropy production identity (41) be- 
comes 

= 2v{\) = / 2C{X{t'),dX/dt')dt' >0, 
dt Jt 

since v is homogeneous of degree 2, and hence \*dv/d\ = 2v. Thus the time-integrated, 
squared-norm of the lack-of-fit along the extremal A equals the entropy production of the 
predicted evolution. In this light it is evident that the dynamical optimization principle 
defining our best-fit closure theory estabfishes the tight relationship between the rate 
of information loss in the reduced dynamics and the quantified lack-of-fit of the trial 
densities to the Liouville equation. The one-to-one correspondence between M and D in 
(46) expresses this relationship for relaxation in a neighborhood of equilibrium. 



5 Nonstationary formulation 

The stationary formulation of the best-fit closure developed in the preceding sections has 
an irreversible term that is time-independent, being the minus gradient of the solution v{\) 
of the time-independent Hamilton- Jacobi equation (32). But when the modeled evolution 
is initiated by a quasi-equilibrium density p(Ao) for a given initial state Aq G M"^, there 
is an early phase of the relaxation in which the entropy production increases from zero. 
After a sufficiently long time, the entropy production and the irreversible fiux approach 
their stationary values. This "plateau" effect can be demonstrated by expanding an exact 
solution of the Liouville equation in a power series in time about t = 0. For small t, the 
mean resolved vector corresponding to the exact solution is given by 

a{t) = (e*^A|p(Ao)) = (^|p(Ao)) + tf{Xo) + 0{f), 

recalling the definition of /(A) in (19). If we attach to a{t) a path of trial densities, 
p{X{t)), with X{t) determined so that a{t) — {A \ p{X{t)) ), then 

Xit) = Ao + tC{Xo)-'f{Xo) +0{t^). 

The entropy production along this path is then 

^{a{t)) ^ -X{tA) ^ XlfiXo) + 0{t); 
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and the constant term in this equation vanishes, 

A*/(Ao) = (L(A*A)|p(Ao)) = / Lp{\^)dz = 0. 

Thus, we see that the entropy production of a perfectly fitted path of trial densities grows 
from zero linearly in time, given an initial density that is itself a trial density. 

Physically, such an initial condition is naturally created by applying constant external 
perturbations with strengths proportional to Aq, . . . , A™ via the resolved variables and 
allowing the system to equilibriate with respect to the perturbed Hamiltonian H—(3~^\qA. 
When these external perturbations are switched off at time i = 0, a relaxation occurs that 
is the object of the modeling exercise. Moreover, in numerical experiments that test a 
closure theory it is often convenient to use initial ensembles defined by quasi-equilibrium 
trial densities. For these reasons, we now modify the defining optimization principle (31) 
to include such a plateau effect. We refer to this as the nonstationary formulation, in that 
the value function v — v{\, t) becomes time dependent. 

The nonstationary optimization principle has the value function 

rti 

v(Ai,ti) = min / £(A,-A)rft, (47) 

A(ti)=Ai Jo 

in which the admissible paths A(t), < t < ti , in the configuration space of the statistical 
model are constrained to terminate at Ai at time ti > 0, while A(0) is unconstrained. 
The integrand in (47) is modified to account for time-reversal, and the admissible paths 
may be viewed as evolving in reversed time, t — ti — t, starting from Ai at r = 0. The 
value function v{Xi, ti) quantifies the optimal lack-of-fit of a time- reversed path connecting 
the current state Ai to an unspecified initial state A(0). By contrast, the stationary 
formulation optimizes over admissible paths in forward time that join the current state 
Aq at time to to unspecified limit states as t ^ -|-oo; convergence of the time integral then 
requires that the extremal tends to equilibrium. In essence, the nonstationary formulation 
ties the value function to the time at which the initial condition is specified, while the 
stationary formulation ties the value function to equilibration at infinite time. 

The value function (47) is the unique solution of the initial value problem (with (A, t) 
replacing (Ai, ti) ) 

dv I dv \ 

— + I A, j = 0, fori>0, with v(A, 0) = , (48) 

which is a time- reversed Hamilton- Jacobi equation. That is, keeping 'H(A, /x) the Legendrc 
transform of >C(A, A) as in (33), the Hamiltonian corresponding to C{\ —A) is ?^(A, — //); 
thus (48) follows from (47). 

In terms of the time-dependent value function, v(A,i), closure is achieved by setting 

m = -fx^^'^)- (^^^ 

This relation is the nonstationary analogue to (34). The nonstationary reduced equations 
are determined by this relation in the same way that (35) follows from (34); namely. 
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But, in contrast to the stationary case, the solution A(t) of (50) is not itself an extremal 
of the defining optimization principle; the extremal paths that define v{\, t) for each (A, t) 
vary with t as well as with A, and they instantaneously determine the fiux of the predicted 
evolution A(t). 

Since 'H(A, n) is positive-definite in fi, the viscosity solution v{X, t) of (48) exists for all 
time t > 0, is unique, and coincides with the value function defined by (47) [17]. Moreover, 
v{X,t) — )■ f(A), the stationary value function, as t ^ +oo. Thus, the nonstationary best- 
fit closure is a natural generalization of the stationary closure that straightforwardly 
includes an intrinsic plateau effect. 

We remark that it is possible to consider solutions of (48) having nonzero initial 
conditions f(A,0). One natural alternative is to optimize over timc-rcvcrsed admissi- 
ble paths X{t) that attach to the given initial state, so that A(0) = Aq. This require- 
ment produces a value function, V{X,t; Xq), that is singular at i = in the sense that 
V{X,t;Xo) ~ (A — Ao)*C(Ao)(A — Ao)/2t as t ^ 0+. A theory based on this value function, 
and using (50) with V replacing v, constitutes another natural formulation of the plateau 
effect. The resulting reduced equations, however, are less attractive as a tractable closure 
because they are inhomogeneous in the mean resolved vector a, they require a more in- 
tricate solution of (48), and their properties are more difficult to deduce. Consequently, 
even though this approach incorporates the plateau effect in a strong form, we prefer the 
nonstationary theory that follows from the homogeneous initial condition v{X, 0) = in 
(48), and that shares the general properties the stationary theory. 

In particular, the entropy production identity (41) and the inequality (42) remain 
valid in the homogeneous, nonstationary closure theory, since the time-dependent value 
function v{X, t) continues to satisfy the equilibrium conditions (36). Moreover, the Hessian 
matrix of second partial derivatives of the value function evaluated at equilibrium, 

satisfies the Riccati differential equation 

+ MC-^M + JC-^M - MC-^J = D , M(0) = 0. (51) 



dt 



The derivation of (51) merely requires taking the matrix of second partial derivatives 
of (48) and evaluating the result at A = 0. This equation imphes that M{t) is semi- 
definite, because the source term D is semi-definite; in the non-degenerate case when D is 
positive-definite, which is the case whenever none of the resolved variables is a dynamical 
invariant, M{t) is positive-definite for t > [29]. The equilibrium point A = is then a 
strict local minimum for ^(A, t) for alH > 0. 

The near-equilibrium linearization of the nonstationary best-fit closure theory has a 
time- varying linear relation between fiux and force along an extremal path, fi = —M{t)X. 
Consequently, the nonstationary version of the closed reduced equation is a nonautonomous 
linear system of ODEs entirely analogous to (45); namely, 

^ = [J-M{t)]X, a = CA. (52) 
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The value function v{X,t) is the time-dependent quadratic form (44). 

As a simple illustration let us consider the near-equilibrium, nonstationary closure for 
a single resolved variable A e IR^, that is, for m = 1. Then C > 0, J = 0, and there is a 
single closure parameter D > 0. The solution to the scalar Riccati differential equation 
(51) and the closed reduced equation (52) are, respectively. 



This solution clearly shows the plateau effect for small t and the exponential relaxation 
for large t. 

The analogous multivariate form of this result holds for m > 1 provided that all 
the resolved variables Ai,...,Am, as well as H, are even under time-reversal; that is, 
^k{(l, —p) = Ak{q,p), for /c = 1, . . . , m. Then, each LA^. is odd, and consequently J = 
{{LA)A*)f,g = 0. And the same holds if all the A^ are odd. In these situations the closed 
reduced dynamics governed by (52) and (51) is diagonalizable. Namely, let V be the 
m X m matrix of eigenvectors for D relative to C; that is, V is the matrix for which 
V*CV = I and V*DV — A, where A is the diagonal matrix whose diagonal consists of 
the associated eigenvalues 71,..., 7^ > 0. In the transformed variables, b{t) = V*a{t) 
and N{t) = V*M{t)V, the closed reduced dynamics (52) and (51) become 



The solution N{t) of this transformed Riccati equation is the diagonal matrix having diag- 



thcory for these resolved variables predicts a linear relaxation that combines a plateau 
effect with an exponential decay for m normal modes with ni characteristic time scales 



In the general near-equilibrium relaxation for which special symmetries are not present 
in the selected resolved variables, the antisymmetric matrix J is nonzero, and it partici- 
pates in the Riccati matrix equation (51) that determines M{t). Consequently, J, which 
defines the reversible part of closed reduced dynamics, also influences the irreversible part 
controlled by M(t). A central prediction of the best- fit closure theory, therefore, is the 
coupling between the reversible and irreversible parts of the closed reduced equations, 
which implies a quantitative coupling between the time scales of reversible oscillations 
and the time scales of irreversible relaxation. 
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6 Discussion 



We have developed a methodology for constructing nonequilibrium statistical models of 
Hamiltonian dynamical systems based on the recognition that such models are essentially 
estimation strategies. When seeking closure in terms of a reduced description, it is de- 
sirable to have the flexibility to expand or contract the set of resolved, or macroscopic, 
variables, while maintaining a consistent approximation strategy. Expansion increases 
accuracy at the expense of model complexity, while contraction improves tractability at 
the expense of model fidelity. Whatever the compromise between accuracy and tractabil- 
ity, the goal of the modeling exercise is to obtain a dynamical closure that is as effective 
as possible for the selected macrovariables. Motivated by these considerations, we have 
developed a best-fit criterion for canonical statistical models associated with any inde- 
pendent set of resolved variables. 

Our optimization principle produces an irreversible, dissipative macrodynamics that 
minimizes a certain lack-of-fit of the selected statistical model to the reversible, conserva- 
tive microdynamics. The structure of the cost function in this principle conforms to the 
separation of the reduced dynamics into reversible and irreversible parts, in that it assigns 
relative weights to the resolved and unresolved components of the Liouville residual. In 
particular, the contribution of the unresolved component to the cost function is the source 
of irreversibility in the best-fit closure, and the rates of relaxation of the macrovariables 
scale with the weights assigned to the unresolved component. 

The closed reduced equations that follow from the optimality conditions have the form 
of governing equations for nonequilibrium thermodynamics. The value function, which 
is the central quantity in the optimization theory, becomes the pivotal quantity in the 
thermodynamic formalism. Specifically, the gradient of the value function with respect to 
the vector of affinities is the vector of corresponding fluxes. The value function is therefore 
the natural extension of the Rayleigh-Onsager dissipation function beyond the range of 
linear irreversible thermodynamics. The main novelty of our variational formulation is 
that this key thermodynamic potential is realized as the minimal lack-of-fit of macroscopic 
paths to the underlying microscopic dynamics. 

The utility of reduced models and statistical closures for any particular system depends 
upon two factors: (i) a convenient set of the resolved variables with some separation 
of time scales between the the resolved (slow) and unresolved (fast) variables; (ii) an 
expression for, or approximation of, the dissipation function in terms of a manageable 
number of adjustable parameters. Traditionally, thermodynamic modeling has tended 
to concentrate on special physical situations where these two requirements are met very 
strongly. But a closure theory can also be worth pursing even in situations that are not 
extremely favorable. For instance, coarse-grained models of turbulent behavior or sub-grid 
scale parameterizations for complex systems in which coherent structures interact with 
fluctuations over a range of scales may necessitate somewhat crude closures, or perhaps 
a hierarchy of closures. In problems of this kind, our optimization approach furnishes a 
systematic method for deriving closures approximations that share the formal structure 
of thermodynamics. 

As one concrete example we mention an implementation of our general methodol- 
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ogy to the truncated Burgers-Hopf equation [27]. This system has been proposed as a 
good testbed for model reduction techniques because it is a relatively simple prototype 
of turbulent fluid systems [33, 34] . The continuum dynamics of the Burgers-Hopf PDE 
is truncated to n Fourier modes, and then a reduction to the lowest m <^ n modes is 
considered. The truncated system, though it has a noncanonical Hamiltonian structure, 
possesses a Gaussian invariant measure, and for n > 20 its computed dynamics are ob- 
served to be ergodic and mixing and to have nearly Gaussian statistics. A best-flt closure 
theory based on Gaussian trial densities produces explicit, closed reduced equations for 
the relaxation of the m resolved modes from noncquilibrium initial conditions. These 
equations have the form of the m-mode truncation itself but with strong mode-dependent 
dissipation and modified nonlinear modal interactions. Even though this closure has only 
one adjustable parameter and the time-scale separation is meager, the predictions of the 
best-fit closure with 5 resolved modes agree quite well with direct numerical simulations 
of large ensembles of solutions of a 50-mode dynamics. Moreover, this agreement holds for 
far-from-equilibrium initial conditions as well as ncar-to-equilibrium conditions. In light 
of the structure of the Burgers-Hopf dynamics — essentially a quadratic nonlinearity of 
hydrodynamic type — these results suggest that more realistic systems of this kind may 
be amenable to a best-fit closure. 

The present paper has focussed exclusively on the relaxation from noncquilibrium 
statistical initial conditions of conservative, deterministic dynamical systems. It remains 
to develop the analogous theory for forced systems with nonconservative or stochastic 
dynamics. In the latter context, a promising connection exists between our approach and 
the studies of nonequilibrium steady states of driven diffusive systems by Bertini et al. 
[5, 6, 7]. In that work a natural variational principle over paths also arises as well as an 
associated Hamilton- Jacobi equation. In contrast to our approach, though, these authors 
are primarily concerned with characterizing theoretically the thermodynamic limits of 
stochastic microscopic models, not with deriving computationally tractable, predictive 
reduced models of complex dynamical systems. Nonetheless, the common features of all 
these reduction strategies and nonequilibrium thermodynamic structures would benefit 
from further investigations. 
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